LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA–protein interaction identification
Authors
Abstract
Background: Long noncoding RNAs (lncRNAs) play important roles in various biological and pathological processes. Discovering lncRNA–protein interactions (LPIs) helps to understand the biological functions and mechanisms of lncRNAs. Although wet experiments find a few interactions between lncRNAs and proteins, experimental techniques are costly and time-consuming. Therefore, computational methods are increasingly exploited to uncover possible associations. However, existing computational methods have several limitations. First, the majority of them were measured based on one simple dataset, which may result in prediction bias. Second, few of them are applied to identify relevant data for new lncRNAs (or proteins). Finally, they failed to utilize diverse biological information of lncRNAs and proteins.
Results: Under a feed-forward deep architecture based on gradient boosting decision trees (LPI-deepGBDT), this work focuses on classifying unobserved LPIs. First, three human LPI datasets and two plant LPI datasets are arranged. Second, the biological features of lncRNAs and proteins are extracted by Pyfeat and BioProt, respectively. Thirdly, the features are dimensionally reduced and concatenated as a vector to represent an lncRNA–protein pair. Finally, a deep architecture composed of forward mappings and inverse mappings is developed to predict underlying linkages between lncRNAs and proteins. LPI-deepGBDT is compared with five classical LPI prediction models (LPI-BLS, LPI-CatBoost, PLIPCOM, LPI-SKF, and LPI-HNM) under three cross validations on lncRNAs, proteins, and lncRNA–protein pairs, respectively. It obtains the best average AUC and AUPR values in the majority of situations, significantly outperforming the other five LPI identification methods. That is, the AUCs computed by LPI-deepGBDT are 0.8321, 0.6815, and 0.9073, respectively, and the AUPRs are 0.8095, 0.6771, and 0.8849, respectively. The results demonstrate the powerful classification ability of LPI-deepGBDT. Case study analyses show that there may be interactions between GAS5 and Q15717, RAB30-AS1 and O00425, and LINC-01572 and P35637.
Conclusions: By integrating ensemble learning and hierarchical distributed representations and building a multiple-layered deep architecture, this work improves LPI prediction performance and effectively probes interaction data for new lncRNAs/proteins.
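The pipeline in the abstract (reduce each feature set, concatenate per pair, classify with boosted trees, score under cross validation on lncRNAs) can be illustrated with a minimal sketch. This is not the authors' LPI-deepGBDT: a single scikit-learn GradientBoostingClassifier stands in for the multiple-layer model, PCA is an assumed choice of dimension reduction, and the features and labels are random placeholders rather than Pyfeat/BioProt output or real LPI data. All sizes and names below are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)

# Hypothetical sizes; random placeholders stand in for extracted features.
n_lnc, n_prot = 30, 20
lnc_feats = rng.normal(size=(n_lnc, 50))
prot_feats = rng.normal(size=(n_prot, 40))


def reduce_dim(features, n_components=8):
    """Reduce feature dimension (PCA is an assumption, not the paper's stated method)."""
    return PCA(n_components=n_components, random_state=0).fit_transform(features)


lnc_red = reduce_dim(lnc_feats)
prot_red = reduce_dim(prot_feats)

# Enumerate every lncRNA-protein pair and concatenate the two reduced vectors.
lnc_idx, prot_idx = np.meshgrid(np.arange(n_lnc), np.arange(n_prot), indexing="ij")
lnc_idx, prot_idx = lnc_idx.ravel(), prot_idx.ravel()
X = np.hstack([lnc_red[lnc_idx], prot_red[prot_idx]])

# Synthetic interaction labels, for illustration only.
signal = lnc_red[lnc_idx, 0] * prot_red[prot_idx, 0]
y = (signal + rng.normal(scale=0.5, size=signal.shape) > 0).astype(int)


def cross_validate(X, y, groups, n_splits=5):
    """Grouped CV: all pairs sharing a group id are held out together."""
    aucs, auprs = [], []
    for train, test in GroupKFold(n_splits=n_splits).split(X, y, groups):
        model = GradientBoostingClassifier(n_estimators=50, random_state=0)
        model.fit(X[train], y[train])
        prob = model.predict_proba(X[test])[:, 1]
        aucs.append(roc_auc_score(y[test], prob))
        # Average precision is used here as a common estimate of AUPR.
        auprs.append(average_precision_score(y[test], prob))
    return float(np.mean(aucs)), float(np.mean(auprs))


# Cross validation on lncRNAs: test-fold lncRNAs never appear in training.
auc_lnc, aupr_lnc = cross_validate(X, y, groups=lnc_idx)
print(f"CV on lncRNAs: AUC={auc_lnc:.4f}, AUPR={aupr_lnc:.4f}")
```

Passing `groups=prot_idx` instead would give the analogous cross validation on proteins. Grouping matters because a random split over pairs lets the same lncRNA appear in both training and test folds, which does not reflect prediction for a new lncRNA.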
Similar resources
TF Boosted Trees: A Scalable TensorFlow Based Framework for Gradient Boosting
TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to preve...
A conjugate gradient based method for Decision Neural Network training
Decision Neural Network is a new approach for solving multi-objective decision-making problems based on artificial neural networks. Using inaccurate evaluation data, network training has improved and the number of educational data sets has decreased. The available training method is based on the gradient descent method (BP). One of its limitations is related to its convergence speed. Therefore,...
Boosting Lazy Decision Trees
This paper explores the problem of how to construct lazy decision tree ensembles. We present and empirically evaluate a relevance-based boosting-style algorithm that builds a lazy decision tree ensemble customized for each test instance. From the experimental results, we conclude that our boosting-style algorithm significantly improves the performance of the base learner. An empirical comparison...
Boosting Decision Trees
A new boosting algorithm of Freund and Schapire is used to improve the performance of decision trees which are constructed using the information ratio criterion of Quinlan’s C4.5 algorithm. This boosting algorithm iteratively constructs a series of decision trees, each decision tree being trained and pruned on examples that have been filtered by previously trained trees. Examples that have been...
Smart City Mobility Application: Gradient Boosting Trees for Mobility Prediction and Analysis Based on Crowdsourced Data
Mobility management represents one of the most important parts of the smart city concept. The way we travel, at what time of the day, for what purposes and with what transportation modes, have a pertinent impact on the overall quality of life in cities. To manage this process, detailed and comprehensive information on individuals' behaviour is needed as well as effective feedback/communication ...
Journal
Journal title: BMC Bioinformatics
Year: 2021
ISSN: 1471-2105
DOI: https://doi.org/10.1186/s12859-021-04399-8